In the Claims 

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

1. (Currently Amended) A machine-implemented method comprising: 

extracting portions from segment boundary regions of a plurality of speech 

segments, each segment boundary region based on a corresponding initial 
unit boundary;

creating feature vectors that represent the portions in a vector space; 

creating concatenation vectors in the vector space, each concatenation vector 
corresponding to unit boundaries of at least two segment boundary
regions, the at least two segment boundary regions being of
separate speech segments of the plurality of speech segments;

for each of a plurality of potential unit boundaries within each segment boundary 
region, determining an average discontinuity based on distances between 
the feature vectors and the concatenation vectors, the average being over 
more than one of the plurality of speech segments; and

for each segment, selecting the potential unit boundary associated with a 
minimum average discontinuity as a new unit boundary. 
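
By way of non-limiting illustration, the selecting step of claim 1 can be sketched in Python as an argmin over averaged discontinuities. The function name `select_boundary` and the array layout `disc[p, c]` (potential unit boundary `p`, concatenation `c`) are assumptions made for this sketch and do not appear in the application.

```python
import numpy as np

def select_boundary(disc: np.ndarray) -> int:
    """Return the index of the potential unit boundary whose average
    discontinuity is smallest.

    disc[p, c] holds the discontinuity observed for potential unit
    boundary p in concatenation c (hypothetical layout).
    """
    avg = disc.mean(axis=1)        # average discontinuity per potential boundary
    return int(np.argmin(avg))     # boundary with the minimum average

# Toy values, chosen arbitrarily for the sketch.
disc = np.array([[0.4, 0.6],
                 [0.1, 0.3],
                 [0.5, 0.2]])
best = select_boundary(disc)
```

In this toy input the row averages are 0.5, 0.2 and 0.35, so the second potential boundary (index 1) is selected.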

2. (Original) The machine-implemented method of claim 1, further comprising: 

if all of the new unit boundaries are the same as the corresponding initial unit 

boundaries, setting the new unit boundaries as final unit boundaries for the
segments.

3. (Original) The machine-implemented method of claim 1, further comprising: 

if any of the new unit boundaries are different from the corresponding initial unit 

boundaries, iteratively: 

setting the new unit boundary as the initial unit boundary, and 
performing the extracting, the creating, the determining and the selecting, 

until all of the new unit boundaries are the same as the corresponding initial unit 

boundaries.
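
By way of non-limiting illustration, the iteration recited in claims 2 and 3 can be sketched as a fixed-point loop. The names `iterate_boundaries`, `select_new` and `toy_select`, and the `max_iters` safety cap, are assumptions made for this sketch; `select_new` stands in for the extracting, creating, determining and selecting steps of claim 1.

```python
from typing import Callable, List

def iterate_boundaries(initial: List[int],
                       select_new: Callable[[List[int]], List[int]],
                       max_iters: int = 100) -> List[int]:
    """Repeat the selection until the new boundaries equal the initial ones."""
    current = list(initial)
    for _ in range(max_iters):
        new = select_new(current)
        if new == current:
            # Claim 2: every new boundary matches its initial boundary,
            # so the new boundaries are taken as final.
            return new
        # Claim 3: each new boundary becomes the initial boundary.
        current = new
    return current  # safety cap only; not recited in the claims

# Toy stand-in for the per-iteration steps: move each boundary one step toward 5.
def toy_select(boundaries: List[int]) -> List[int]:
    return [b + (b < 5) - (b > 5) for b in boundaries]

final = iterate_boundaries([2, 7, 5], toy_select)
```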

4. (Original) The machine-implemented method of claim 1, wherein the average 

discontinuity is determined over a plurality of concatenations. 

5. (Original) The machine-implemented method of claim 1, wherein the initial unit 

boundary is in the middle of a phoneme. 

6. (Original) The machine-implemented method of claim 1, wherein each potential unit 

boundary defines two candidate units for each speech segment. 

7. (Currently Amended) The machine-implemented method of claim 6, wherein a 

concatenation of a plurality of concatenations includes a candidate unit of a
first segment linked to a candidate unit of a second segment. 



8. (Original) The machine-implemented method of claim 6, wherein the plurality of
concatenations includes all combinations of a first candidate unit of each segment
with a second candidate unit of each segment. 

9. (Original) The machine-implemented method of claim 1, wherein the plurality of 

speech segments includes speech segments which end in the middle of a first 
phoneme, and speech segments which begin in the middle of a first phoneme.

10. (Original) The machine-implemented method of claim 9, wherein the plurality of

speech segments are stored in a voice table.

11. (Original) The machine-implemented method of claim 1, further comprising: 

recording speech input; and 

identifying the speech segments within the speech input. 

12. (Original) The machine-implemented method of claim 1, wherein the portions include 

centered pitch periods, the centered pitch periods derived from pitch periods of 
the segments. 

13. (Original) The machine-implemented method of claim 12, wherein the feature vectors 

incorporate phase information of the portions. 

14. (Original) The machine-implemented method of claim 13, wherein creating feature 

vectors comprises: 

constructing a matrix W from the portions; and
decomposing the matrix W.



15. (Currently Amended) A machine-implemented method comprising:

extracting portions from segment boundary regions of a plurality of speech 

segments, each segment boundary region based on a corresponding initial 
unit boundary; 

creating feature vectors that represent the portions in a vector space; 
for each of a plurality of potential unit boundaries within each segment boundary 
region, determining an average discontinuity based on distances between 
the feature vectors; and 
for each segment, selecting the potential unit boundary associated with a 

minimum average discontinuity as a new unit boundary; 
wherein the portions include centered pitch periods, the centered pitch periods 
derived from pitch periods of the segments, wherein the feature vectors 
incorporate phase information of the portions, wherein creating feature 
vectors comprises: 

constructing a matrix W from the portions; and
decomposing the matrix W,
and wherein the matrix W is a $(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^{T}$$

where $K-1$ is the number of centered pitch periods near the potential unit boundary
extracted from each segment, $N$ is the maximum number of samples among the
centered pitch periods, $M$ is the number of segments, $U$ is the $(2(K-1)+1)M \times R$
left singular matrix with row vectors $u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, $V$ is the $N \times R$ right
singular matrix with row vectors $v_j$ ($1 \le j \le N$), $R \ll (2(K-1)+1)M$, and $^{T}$
denotes matrix transposition, wherein decomposing the matrix W comprises

performing a singular value decomposition of W.
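
By way of non-limiting illustration, an order-R singular value decomposition of a matrix with the dimensions recited in claim 15 can be sketched with NumPy. The helper name `decompose`, the toy sizes, and the random placeholder data are assumptions made for this sketch; in practice each row of W would hold the samples of one centered pitch period.

```python
import numpy as np

def decompose(W: np.ndarray, R: int):
    """Order-R truncated SVD, so that W is approximated by U @ diag(s) @ Vt."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the R largest singular values and the matching singular vectors.
    return U[:, :R], s[:R], Vt[:R, :]

K, M, N, R = 3, 4, 16, 2               # toy sizes, chosen arbitrarily
rows = (2 * (K - 1) + 1) * M           # one row per centered pitch period
rng = np.random.default_rng(0)
W = rng.standard_normal((rows, N))     # placeholder for real pitch-period samples
U, s, Vt = decompose(W, R)
```

Here `U` has one row per centered pitch period and `Vt` is the transpose of the N x R matrix V, so its shape is R x N.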

16. (Original) The machine-implemented method of claim 15, wherein the centered pitch 

periods are symmetrically zero padded to N samples.
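
By way of non-limiting illustration, symmetric zero padding of a centered pitch period to N samples can be sketched as follows. The function name `pad_symmetric` is an assumption made for this sketch, as is the choice to place the extra zero on the right when the amount of padding is odd.

```python
import numpy as np

def pad_symmetric(period: np.ndarray, N: int) -> np.ndarray:
    """Zero pad a centered pitch period on both sides up to N samples."""
    extra = N - len(period)
    if extra < 0:
        raise ValueError("pitch period is longer than N samples")
    left = extra // 2                  # odd remainder goes to the right (assumption)
    return np.pad(period, (left, extra - left))

padded = pad_symmetric(np.array([1.0, 2.0, 3.0]), 7)
```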

17. (Original) The machine-implemented method of claim 15, wherein a feature vector $\bar{u}_i$

is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period $i$, and $\Sigma$ is the
singular diagonal matrix.

18. (Original) The machine-implemented method of claim 17, wherein the distance 

between two feature vectors is determined by a metric comprising a closeness 
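measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^{2} u_l^{T}}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.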
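
By way of non-limiting illustration, the closeness measure C of claim 18, namely the cosine of the angle between two row vectors of U after scaling by the singular values, can be sketched as follows. The function name `closeness` and the toy inputs are assumptions made for this sketch.

```python
import numpy as np

def closeness(u_k, u_l, s) -> float:
    """Cosine between u_k * s and u_l * s, where s holds the singular values."""
    s = np.asarray(s, dtype=float)
    a = np.asarray(u_k, dtype=float) * s   # scaling by a diagonal matrix
    b = np.asarray(u_l, dtype=float) * s
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

c = closeness([1.0, 1.0], [1.0, -1.0], [2.0, 1.0])
```

With these toy inputs the scaled vectors are (2, 1) and (2, -1), so the dot product is 3, each norm is the square root of 5, and C is 0.6.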

19. (Original) The machine-implemented method of claim 18, wherein a discontinuity

$d(S_1, S_2)$ between two candidate units, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$

is a feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature

vector associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector

associated with a centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated
with a centered pitch period $\sigma_0$.
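
By way of non-limiting illustration, the four-term discontinuity of claim 19 can be sketched as follows, taking already-scaled feature vectors as input. The function names `cosine` and `discontinuity`, and the dictionary keys used to label the five centered pitch periods, are assumptions made for this sketch.

```python
import numpy as np

def cosine(a, b) -> float:
    """Closeness C between two feature vectors (already scaled by the singular values)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def discontinuity(f: dict) -> float:
    """d(S1, S2) from the five feature vectors named in claim 19.

    f maps the hypothetical labels 'pi_-1', 'pi_0', 'sigma_0', 'sigma_1'
    and 'delta_0' to their feature vectors.
    """
    return (cosine(f["pi_-1"], f["delta_0"])
            + cosine(f["delta_0"], f["sigma_1"])
            - cosine(f["pi_-1"], f["pi_0"])
            - cosine(f["sigma_0"], f["sigma_1"]))

same = {k: [1.0, 0.0] for k in ("pi_-1", "pi_0", "sigma_0", "sigma_1", "delta_0")}
d_same = discontinuity(same)

# Make the vector for delta_0 orthogonal to all the others.
orth = dict(same, delta_0=[0.0, 1.0])
d_orth = discontinuity(orth)
```

When all five vectors coincide the four terms cancel and d is 0; when only the vector for $\delta_0$ is orthogonal to the rest, the two positive terms vanish and d is -2.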

20. (Original) The machine-implemented method of claim 19, wherein the same 

closeness measure, C, is used for optimizing unit boundaries and for unit 
selection. 

21. (Currently Amended) A machine-readable storage medium storing
machine-executable instructions that when executed by a machine cause the machine to
perform a machine-implemented method comprising: 

extracting portions from segment boundary regions of a plurality of speech 
segments, each segment boundary region based on a corresponding 
initial unit boundary; 
creating feature vectors that represent the portions in a vector space; 
creating concatenation vectors in the vector space, each concatenation 
vector corresponding to unit boundaries of at least two segment 
boundary regions, each of the at least two segment boundary
regions being of separate speech segment of the plurality of speech
segments; 

for each of a plurality of potential unit boundaries within each segment 
boundary region, determining an average discontinuity based on 
distances between the feature vectors and the concatenation 
vectors, the average being over more than one of the plurality of 
speech segments; and

for each segment, selecting the potential unit boundary associated with a 
minimum average discontinuity as a new unit boundary. 



22. (Currently Amended) The machine-readable storage medium of claim 21, wherein the 

method further comprises: 

if all of the new unit boundaries are the same as the corresponding initial 
unit boundaries, setting the new unit boundaries as final unit boundaries 
for the segments.

23. (Currently Amended) The machine-readable storage medium of claim 21, wherein the 

method further comprises: 

if any of the new unit boundaries are different from the corresponding 
initial unit boundaries, iteratively: 

setting the new unit boundary as the initial unit boundary, and 
performing the extracting, the creating, the determining and the 
selecting, 

until all of the new unit boundaries are the same as the corresponding
initial unit boundaries.



24. (Currently Amended) The machine-readable storage medium of claim 21, wherein the 

average 

discontinuity is determined over a plurality of concatenations. 

25. (Currently Amended) The machine-readable storage medium of claim 21, wherein the 
initial unit 

boundary is in the middle of a phoneme. 

26. (Currently Amended) The machine-readable storage medium of claim 21, wherein each 
potential unit 

boundary defines two candidate units for each speech segment. 

27. (Currently Amended) The machine-readable storage medium of claim 26, wherein a 

concatenation of a plurality of concatenations includes a candidate unit of a
first segment linked to a candidate unit of a second segment. 

28. (Currently Amended) The machine-readable storage medium of claim 26, wherein the 
plurality of 

concatenations includes all combinations of a first candidate unit of each segment 
with a second candidate unit of each segment.

29. (Currently Amended) The machine-readable storage medium of claim 21, wherein the
plurality of speech

segments includes speech segments which end in the middle of a first phoneme,
and speech segments which begin in the middle of a first phoneme.

30. (Currently Amended) The machine-readable storage medium of claim 29, wherein the 
plurality of speech 

segments are stored in a voice table.

31. (Currently Amended) The machine-readable storage medium of claim 21, wherein the 
method further 

comprises: 

recording speech input; and 

identifying the speech segments within the speech input. 

32. (Currently Amended) The machine-readable storage medium of claim 21, wherein the 

portions include centered pitch periods, the centered pitch periods derived from 
pitch periods of the segments. 

33. (Currently Amended) The machine-readable storage medium of claim 32, wherein the 

feature vectors incorporate phase information of the portions. 

34. (Currently Amended) The machine-readable storage medium of claim 33, wherein 
creating feature
vectors comprises:

constructing a matrix W from the portions; and
decomposing the matrix W. 



35. (Currently Amended) A machine-readable storage medium having machine- 
executable instructions that when executed by a machine cause the machine to 
perform a machine-implemented method comprising: 

extracting portions from segment boundary regions of a plurality of speech 
segments, each segment boundary region based on a corresponding 
initial unit boundary;
creating feature vectors that represent the portions in a vector space;
for each of a plurality of potential unit boundaries within each segment
boundary region, determining an average discontinuity based on
distances between the feature vectors; and
for each segment, selecting the potential unit boundary associated with a

minimum average discontinuity as a new unit boundary;
wherein the portions include centered pitch periods, the centered pitch periods 
derived from pitch periods of the segments, wherein the feature vectors 
incorporate phase information of the portions, wherein creating feature 
vectors comprises: 

constructing a matrix W from the portions; and
decomposing the matrix W,
and wherein the matrix W is a $(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^{T}$$

where $K-1$ is the number of centered pitch periods near the potential unit boundary
extracted from each segment, $N$ is the maximum number of samples among the
centered pitch periods, $M$ is the number of segments, $U$ is the $(2(K-1)+1)M \times R$
left singular matrix with row vectors $u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, $V$ is the $N \times R$ right
singular matrix with row vectors $v_j$ ($1 \le j \le N$), $R \ll (2(K-1)+1)M$, and $^{T}$
denotes matrix transposition, wherein decomposing the matrix W comprises
performing a singular value decomposition of W.

36. (Currently Amended) The machine-readable storage medium of claim 35, wherein the 
centered pitch 

periods are symmetrically zero padded to N samples. 

37. (Currently Amended) The machine-readable storage medium of claim 35, wherein a
feature vector $\bar{u}_i$ is

calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period $i$, and $\Sigma$ is the
singular diagonal matrix.

38. (Currently Amended) The machine-readable storage medium of claim 37, wherein the
distance between

two feature vectors is determined by a metric comprising a closeness measure, C,
between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^{2} u_l^{T}}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.

39. (Currently Amended) The machine-readable storage medium of claim 38, wherein a
discontinuity

$d(S_1, S_2)$ between two candidate units, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$

is a feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature

vector associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector

associated with a centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector
associated with a centered pitch period $\sigma_0$.

40. (Currently Amended) The machine-readable storage medium of claim 39, wherein the 
same closeness 

measure, C, is used for optimizing unit boundaries and for unit selection. 

41. (Currently Amended) An apparatus comprising: 

means for extracting portions from segment boundary regions of a plurality of
speech segments, each segment boundary region based on a corresponding
initial unit boundary; 

means for creating feature vectors that represent the portions in a vector space; 

means for creating concatenation vectors in the vector space, each concatenation 
vector corresponding to unit boundaries of at least two segment boundary
regions, the at least two segment boundary regions being of 
separate speech segments of the plurality of speech segments; 

for each of a plurality of potential unit boundaries within each segment 

boundary region, means for determining an average discontinuity based on 
distances between the feature vectors and the concatenation vectors, the
average being over more than one of the plurality of speech segments; and

for each segment, means for selecting the potential unit boundary associated with 
a minimum average discontinuity as a new unit boundary. 

42. (Original) The apparatus of claim 41, further comprising: 

if all of the new unit boundaries are the same as the corresponding initial unit 
boundaries, means for setting the new unit boundaries as final unit boundaries for 
the segments. 

43. (Original) The apparatus of claim 41, further comprising: 

if any of the new unit boundaries are different from the corresponding initial unit 
boundaries, means for iteratively: 

setting the new unit boundary as the initial unit boundary, and
performing the extracting, the creating, the determining and the selecting,
until all of the new unit boundaries are the same as the corresponding initial unit 
boundaries. 

44. (Original) The apparatus of claim 41, wherein the average discontinuity is determined 

over a plurality of concatenations. 

45. (Original) The apparatus of claim 41, wherein the initial unit boundary is in the 

middle of a phoneme. 

46. (Original) The apparatus of claim 41, wherein each potential unit boundary defines 

two candidate units for each speech segment. 

47. (Currently Amended) The apparatus of claim 46, wherein a concatenation of a

plurality of concatenations includes a candidate unit of a first segment linked to a
candidate unit of a second segment. 

48. (Original) The apparatus of claim 46, wherein the plurality of concatenations includes 

all combinations of a first candidate unit of each segment with a second candidate 
unit of each segment. 

49. (Original) The apparatus of claim 41, wherein the plurality of speech segments 

includes speech segments which end in the middle of a first phoneme, and speech
segments which begin in the middle of a first phoneme.

50. (Original) The apparatus of claim 49, wherein the plurality of speech segments are

stored in a voice table. 

51. (Original) The apparatus of claim 41, further comprising:

means for recording speech input; and 

means for identifying the speech segments within the speech input. 

52. (Original) The apparatus of claim 41, wherein the portions include centered pitch

periods, the centered pitch periods derived from pitch periods of the segments. 

53. (Original) The apparatus of claim 52, wherein the feature vectors incorporate phase 

information of the portions. 

54. (Original) The apparatus of claim 53, wherein creating feature vectors comprises: 

means for constructing a matrix W from the portions; and
means for decomposing the matrix W.

55. (Currently Amended) An apparatus comprising: 

means for extracting portions from segment boundary regions of a plurality of

speech segments, each segment boundary region based on a corresponding
initial unit boundary;

means for creating feature vectors that represent the portions in a vector space;

for each of a plurality of potential unit boundaries within each segment
boundary region, means for determining an average discontinuity based on
distances between the feature vectors; and 
for each segment, means for selecting the potential unit boundary associated with 

a minimum average discontinuity as a new unit boundary, 
wherein the portions include centered pitch periods, the centered pitch periods 
derived from pitch periods of the segments, wherein the feature vectors 
incorporate phase information of the portions, wherein creating feature 
vectors comprises: 

means for constructing a matrix W from the portions; and
means for decomposing the matrix W,
and wherein the matrix W is a $(2(K-1)+1)M \times N$
matrix represented by

$$W = U \Sigma V^{T}$$

where $K-1$ is the number of centered pitch periods near the potential unit boundary
extracted from each segment, $N$ is the maximum number of samples among the
centered pitch periods, $M$ is the number of segments, $U$ is the $(2(K-1)+1)M \times R$
left singular matrix with row vectors $u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, $V$ is the $N \times R$ right
singular matrix with row vectors $v_j$ ($1 \le j \le N$), $R \ll (2(K-1)+1)M$, and $^{T}$
denotes matrix transposition, wherein decomposing the matrix W comprises
performing a singular value decomposition of W.

56. (Original) The apparatus of claim 55, wherein the centered pitch periods are 
symmetrically zero padded to N samples.

57. (Original) The apparatus of claim 55, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period $i$, and $\Sigma$ is the
singular diagonal matrix.

58. (Original) The apparatus of claim 57, wherein the distance between two feature 

vectors is determined by a metric comprising a closeness measure, C, between 
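two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^{2} u_l^{T}}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.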

59. (Original) The apparatus of claim 58, wherein a discontinuity $d(S_1, S_2)$ between two

candidate units, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$

is a feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature

vector associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector

associated with a centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated
with a centered pitch period $\sigma_0$.

60. (Original) The apparatus of claim 59, wherein the same closeness measure, C, is used
for optimizing unit boundaries and for unit selection. 



61. (Currently Amended) A system comprising: 

a processing unit coupled to a memory through a bus; and 
a memory unit storing a process executed from the memory by the
processing unit to cause the processing unit to: 

extract portions from segment boundary regions of a plurality of 

speech segments, each segment boundary region based on a 
corresponding initial unit boundary; 

create feature vectors that represent the portions in a vector space; 

create concatenation vectors in the vector space, each

concatenation vector corresponding to unit boundaries of at 
least two segment boundary regions, each of the at least two 
segment boundary regions being of separate speech 
segment of the plurality of speech segments;

for each of a plurality of potential unit boundaries within each 
segment boundary region, determine an average 
discontinuity based on distances between the feature 
vectors and the concatenation vectors, the average being 
over more than one of the plurality of speech segments; and

for each segment, select the potential unit boundary associated with 
a minimum average discontinuity as a new unit boundary.

62. (Original) The system of claim 61, wherein the process further causes the processing unit
to: 

if all of the new unit boundaries are the same as the corresponding initial unit 
boundaries, set the new unit boundaries as final unit boundaries for the segments. 

63. (Original) The system of claim 61, wherein the process further causes the processing 

unit to: 

if any of the new unit boundaries are different from the corresponding 

initial unit boundaries, iteratively: 

set the new unit boundary as the initial unit boundary, and
perform the extracting, the creating, the determining and the
selecting, 

until all of the new unit boundaries are the same as the corresponding 
initial unit boundaries. 

64. (Original) The system of claim 61, wherein the average discontinuity is determined 

over a plurality of concatenations. 

65. (Original) The system of claim 61, wherein the initial unit boundary is in the middle

of a phoneme. 

66. (Original) The system of claim 61, wherein each potential unit boundary defines two 

candidate units for each speech segment.

67. (Currently Amended) The system of claim 66, wherein a concatenation of a

plurality of concatenations includes a candidate unit of a first segment linked to a 
candidate unit of a second segment. 

68. (Original) The system of claim 66, wherein the plurality of concatenations includes all 

combinations of a first candidate unit of each segment with a second candidate 
unit of each segment. 

69. (Original) The system of claim 61, wherein the plurality of speech segments includes 

speech segments which end in the middle of a first phoneme, and speech segments 
which begin in the middle of a first phoneme. 

70. (Original) The system of claim 69, wherein the plurality of speech segments are 

stored in a voice table. 

71. (Original) The system of claim 61, wherein the process further causes the processing 

unit to: 

record speech input; and 

identify the speech segments within the speech input. 

72. (Original) The system of claim 61, wherein the portions include centered pitch 

periods, the centered pitch periods derived from pitch periods of the segments.

73. (Original) The system of claim 72, wherein the feature vectors incorporate phase
information of the portions. 



74. (Original) The system of claim 73, wherein the process further causes the processing 
unit, when creating feature vectors, to: 

construct a matrix W from the portions; and
decompose the matrix W.



75. (Currently Amended) A system comprising: 

a processing unit coupled to a memory through a bus; and 
a memory unit storing a process executed by the processing unit to cause the 
processing unit to: 

extract portions from segment boundary regions of a plurality of 

speech segments, each segment boundary region based on a 
corresponding initial unit boundary; 
create feature vectors that represent the portions in a vector space; 
for each of a plurality of potential unit boundaries within each 
segment boundary region, determine an average 
discontinuity based on distances between the feature 
vectors; and 

for each segment, select the potential unit boundary associated with 
a minimum average discontinuity as a new unit boundary, 
wherein the portions include centered pitch periods, the centered pitch periods 
derived from pitch periods of the segments, wherein the feature vectors
incorporate phase information of the portions, wherein the process further
causes the processing unit, when creating feature vectors, to:

construct a matrix W from the portions; and

decompose the matrix W,
and wherein the matrix W is a $(2(K-1)+1)M \times N$ matrix
represented by

$$W = U \Sigma V^{T}$$

where $K-1$ is the number of centered pitch periods near the potential unit boundary
extracted from each segment, $N$ is the maximum number of samples
among the centered pitch periods, $M$ is the number of segments, $U$ is the
$(2(K-1)+1)M \times R$ left singular matrix with row vectors $u_i$ ($1 \le i \le (2(K-1)+1)M$), $\Sigma$ is
the $R \times R$ diagonal matrix of singular values $s_1 \ge s_2 \ge \dots \ge s_R > 0$, $V$ is the $N \times R$
right singular matrix with row vectors $v_j$ ($1 \le j \le N$), $R \ll (2(K-1)+1)M$, and $^{T}$
denotes matrix transposition, wherein decomposing the matrix W comprises
performing a singular value decomposition of W.

76. (Original) The system of claim 75, wherein the centered pitch periods are

symmetrically zero padded to N samples.

77. (Original) The system of claim 75, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period $i$, and $\Sigma$ is the
singular diagonal matrix.

78. (Original) The system of claim 77, wherein the distance between two feature vectors

is determined by a metric comprising a closeness measure, C, between two feature 
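vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^{2} u_l^{T}}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \le k, l \le (2(K-1)+1)M$.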

79. (Original) The system of claim 78, wherein a discontinuity $d(S_1, S_2)$ between two

candidate units, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$

is a feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature

vector associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector

associated with a centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated
with a centered pitch period $\sigma_0$.

80. (Original) The system of claim 79, wherein the same closeness measure, C, is used for 

optimizing unit boundaries and for unit selection. 

81. (Currently Amended) A machine-implemented method comprising:

setting an initial unit boundary for each segment of a plurality of speech segments,
each initial unit boundary defining a segment boundary region and a
plurality of potential unit boundaries within each segment boundary 
region; 

creating feature vectors corresponding to each segment boundary region in a 
vector space;

creating concatenation vectors in the vector space, each concatenation vector
corresponding to unit boundaries of at least two segment boundary
regions, the at least two segment boundary regions being of
separate speech segments of the plurality of speech segments;

for each potential unit boundary, determining an average discontinuity
based on distances between the feature vectors and the concatenation
vectors, the average being over more than one of the plurality
of speech segments;

for each segment, selecting the potential unit boundary associated with a 
minimum average discontinuity as a new unit boundary. 

82. (Original) The machine-implemented method of claim 81, further comprising 
iteratively performing: 

for each segment, setting the new unit boundary as the initial unit

boundary; and
performing the determining and the selecting,
until all of the new unit boundaries for each segment are the same as the
corresponding initial unit boundaries for each segment.

83. (Original) The machine-implemented method of claim 82, wherein determining the
average discontinuity comprises: 

constructing a matrix from time-domain samples of segment boundary 

regions; and 
decomposing the matrix. 



84. (Original) The machine-implemented method of claim 83, wherein the time-domain

samples include centered pitch periods.

85. (Currently Amended) A machine-readable storage medium storing
machine-executable instructions that when executed by a machine cause the machine to

perform a machine-implemented method comprising:

setting an initial unit boundary for each segment of a plurality of speech
segments, each initial unit boundary defining a segment boundary 
region and a plurality of potential unit boundaries within each 
segment boundary region; 

creating feature vectors corresponding to each segment boundary region in
a vector space;

creating concatenation vectors in the vector space, each concatenation
vector corresponding to unit boundaries of at least two segment
boundary regions, each of the at least two segment boundary
regions being of separate speech segment of the plurality of speech
segments;

for each potential unit boundary, determining an average
discontinuity based on distances between the feature vectors and
the concatenation vectors, the average being over more than one
of the plurality of speech segments;

for each segment, selecting the potential unit boundary associated with a 
minimum average discontinuity as a new unit boundary. 

86. (Currently Amended) The machine-readable storage medium of claim 85, the method

further comprising 

iteratively performing: 

for each segment, setting the new unit boundary as the initial unit 

boundary; and 
performing the determining and the selecting, 
until all of the new unit boundaries for each segment are the same as the 
corresponding initial unit boundaries for each segment. 
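
For illustration only, the iteration recited in claims 85 and 86 can be sketched as follows. This is not the claimed implementation: the function name `refine_boundaries`, the inputs `cand` and `init`, the midpoint concatenation vector, the Euclidean distance, and the `max_iter` cap are all assumptions introduced for this example.

```python
import numpy as np

def refine_boundaries(cand, init, max_iter=50):
    """Toy boundary refinement (hypothetical sketch).

    cand[s] -- (K, D) array: one assumed feature vector per potential unit
               boundary of segment s.
    init[s] -- index of the initial unit boundary of segment s.
    """
    bounds = list(init)
    for _ in range(max_iter):
        new = []
        for s, feats in enumerate(cand):
            # Feature vectors at the current boundaries of the other segments.
            others = np.stack([cand[t][bounds[t]]
                               for t in range(len(cand)) if t != s])
            # Assumed concatenation vector: midpoint of the two boundary vectors.
            concat = 0.5 * (feats[:, None, :] + others[None, :, :])
            # Discontinuity: distance from each side to the concatenation vector.
            d = (np.linalg.norm(feats[:, None, :] - concat, axis=2)
                 + np.linalg.norm(others[None, :, :] - concat, axis=2))
            # Average over the other segments, then keep the minimum.
            new.append(int(np.argmin(d.mean(axis=1))))
        if new == bounds:  # new boundaries equal the initial ones: stop
            return bounds
        bounds = new       # new boundaries become the initial boundaries
    return bounds
```

The loop stops when no segment changes its boundary, which mirrors the termination condition of claim 86; the iteration cap is an added safeguard because this toy update is not guaranteed to converge.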

87. (Currently Amended) The machine-readable storage medium of claim 86, wherein 

determining the average discontinuity comprises: 

constructing a matrix from time-domain samples of segment boundary 

regions; and 
decomposing the matrix. 
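
As a hedged illustration of the two steps recited in claim 87, the sketch below stacks time-domain samples into a matrix and decomposes it. The claim does not name a particular decomposition; singular value decomposition is assumed here. The function name `boundary_feature_vectors`, the zero-padding scheme, and the `rank` parameter are likewise hypothetical.

```python
import numpy as np

def boundary_feature_vectors(pitch_periods, rank):
    """pitch_periods -- list of 1-D sample arrays (hypothetical input), one per
    pitch period taken from the segment boundary regions."""
    width = max(len(p) for p in pitch_periods)
    W = np.zeros((len(pitch_periods), width))
    for i, p in enumerate(pitch_periods):
        # Zero-pad on both sides so each period stays centered in its row.
        left = (width - len(p)) // 2
        W[i, left:left + len(p)] = p
    # Assumed decomposition: thin SVD, W = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    rank = min(rank, len(s))
    # Row i is the reduced-dimension vector for pitch period i.
    return U[:, :rank] * s[:rank]
```

Each row of the returned array can serve as a feature vector in the reduced space, so identical pitch periods map to identical vectors.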



88. (Currently Amended) The machine-readable storage medium of claim 87, wherein the
time-domain samples include centered pitch periods.



89. (Currently Amended) An apparatus comprising: 

means for setting an initial unit boundary for each segment of a plurality of speech 
segments, each initial unit boundary defining a segment boundary region 
and a plurality of potential unit boundaries within each segment boundary 
region; 

means for creating feature vectors corresponding to each segment boundary region 
in a vector space; 

means for creating concatenation vectors in the vector space, each concatenation 
vector corresponding to unit boundaries of at least two segment boundary 
regions, the at least two segment boundary regions being of 
separate speech segments of the plurality of speech segments; 

for each segment potential unit boundary, means for determining an average
discontinuity based on distances between the feature vectors and the
concatenation vectors, the average being over a plurality of concatenations
of candidate units defined by the potential unit boundaries more than one
of the plurality of speech segments;

for each segment, means for selecting the potential unit boundary associated with 
a minimum average discontinuity as a new unit boundary. 

90. (Original) The apparatus of claim 89, further comprising means for iteratively 

performing: 

for each segment, means for setting the new unit boundary as the initial 
unit boundary; and
means for performing the determining and the selecting, 
until all of the new unit boundaries for each segment are the same as the 
corresponding initial unit boundaries for each segment. 

91. (Original) The apparatus of claim 90, wherein determining the average discontinuity 

comprises: 

means for constructing a matrix from time-domain samples of segment 

boundary regions; and 
means for decomposing the matrix. 

92. (Original) The apparatus of claim 91, wherein the time-domain samples include 

centered pitch periods.

93. (Currently Amended) A system comprising: 

a processing unit coupled to a memory through a bus; and 
a memory unit storing a process executed from the memory by the processing unit 
to cause the processing unit to: 

set an initial unit boundary for each segment of a plurality of 
speech segments, each initial unit boundary defining a 
segment boundary region and a plurality of potential unit 
boundaries within each segment boundary region; 
create feature vectors corresponding to each segment boundary 
region in a vector space; 
create concatenation vectors in the vector space, each

concatenation vector corresponding to unit boundaries of at 
least two segment boundary regions, the at least two 
segment boundary regions being of separate speech 
segments of the plurality of speech segments; 

for each segment potential unit boundary, determine an average
discontinuity based on distances between the feature
vectors and the concatenation vectors, the average being
over a plurality of concatenations of candidate units defined
by the potential unit boundaries more than one of the
plurality of speech segments;

for each segment, select the potential unit boundary associated with 
a minimum average discontinuity as a new unit boundary. 

94. (Original) The system of claim 93, wherein the process further causes the processing 

unit to iteratively: 

for each segment, set the new unit boundary as the initial unit boundary; 
and 

perform the determining and the selecting, 
until all of the new unit boundaries for each segment are the same as the 
corresponding initial unit boundaries for each segment. 

95. (Original) The system of claim 94, wherein the process further causes the processing 

unit, when determining the average discontinuity, to: 
construct a matrix from time-domain samples of segment boundary

regions; and 
decompose the matrix. 

96. (Original) The system of claim 95, wherein the time-domain samples include centered 
pitch periods. 